(54) Method for providing motion-compensated multi-field enhancement of still images from video



(57) A method and system for combining the information from one video field, or multiple video fields, into
a single, high quality still image. A reference field and auxiliary fields are selected and an orientation map is
constructed for the reference field. Motion maps are constructed to model displacement between the reference
and auxiliary fields. The auxiliary fields are directionally interpolated using orientation maps. A merge
mask is used to mask off certain pixels which should not be used in the final enhanced image. A weighted
average is then formed from the reference field pixels and the auxiliary field pixels which have not been
masked off. A final still image is obtained after additional horizontal interpolation. Post-processing
might be used to further sharpen the image. The method and system are applicable to both the luminance
and chrominance components of the video image. The method and system serve to reduce the noise, as well as
the luminance and color aliasing artifacts associated with the reference field, while enhancing its
resolution, by utilizing information from the auxiliary fields.
Description

Field of the Invention 

[0001] This invention relates to a method and system for combining the information from multiple video fields into a
single, high quality still image. 

Background of the Invention 

[0002] Individual fields from video sources generally exhibit the following shortcomings:

sensor, tape and transmission noise;

luminance aliasing due to insufficiently dense spatial sampling of the optical scene;

chrominance aliasing due to insufficiently dense spatial sampling of particular color components in the optical
scene (this often occurs with single CCD video cameras, which can only sense one color component at each pixel
position);

relatively poor resolution.

[0003] However, video sources have the advantage that many pictures of the same scene are available, usually with
relatively small displacements of the scene elements between consecutive fields. After suitable compensation for
motion, these multiple pictures can be combined to produce a still image with less noise. Perhaps more importantly,
however, the existence of motion effectively provides a denser sampling of the optical scene than is available
from any single field. This opens up the possibility for aliasing removal as well as resolution enhancement.
[0004] While analog video is considered here, many of the following observations also apply to a variety of digital video
sources. One observation is that the resolution of the chrominance components is significantly lower than that of the
luminance components. Specifically, the horizontal chrominance resolution of an NTSC (National Television System
Committee) broadcast video source is only a small fraction of that of the luminance. Also, although the NTSC standard
does not limit the vertical resolution of the chrominance components below that of the luminance components, most
popular video cameras inherently halve the vertical chrominance resolution, due to their single CCD design. Since the
chrominance components carry very little spatial information in comparison to the luminance component, a process might
focus resolution enhancement efforts on the luminance channel alone. Moreover, the computational demand of the
multi-field enhancement system can be reduced by working with a coarser set of chrominance samples than that used for
the luminance component.

[0005] A second observation concerning analog video is that the luminance component is often heavily aliased in the
vertical direction, but much less so in the horizontal direction. This is to be expected, since the optical bandwidth is
roughly the same in both the horizontal and vertical directions, but the vertical sampling density is less than half the
horizontal sampling density. Moreover, newer video cameras employ CCD sensors with an increasing number of sensors
per row, whereas the number of sensor rows is set by the NTSC standard. Empirical experiments confirm the expectation
that high horizontal frequencies experience negligible aliasing, whereas high vertical frequencies are subjected to
considerable aliasing. Hence, it is unlikely to be possible to increase the horizontal resolution of the final still image
through multi-field processing; however, it should be possible to "unwrap" aliasing components to enhance the vertical
resolution and remove the annoying aliasing artifacts ("jaggies") around non-vertical edges.

[0006] Hence, what is needed is a method and system for combining the information from multiple video fields into a 
single, high quality still image. 

Summary of the Invention 

[0007] This invention disclosure describes a system for combining the information from multiple video fields into a
single high quality still image. One of the fields is selected to be the reference and the remaining fields are identified as
auxiliary fields. The system reduces the noise, as well as the luminance and color aliasing artifacts associated with the
reference field, while enhancing its resolution, by utilizing information from the auxiliary fields.
[0008] An orientation map is constructed for the reference field and is used to directionally interpolate this field up to
four times the vertical field resolution.

[0009] Motion maps are constructed to model the local displacement between features in the reference field and
corresponding features in each of the auxiliary fields. Motion is computed to quarter pixel accuracy in the vertical direction
and half pixel accuracy in the horizontal direction, using the directionally interpolated reference field to accomplish the
sub-pixel search. The motion maps are used firstly to infer an orientation map for each of the auxiliary fields directly
from the reference field's orientation map (note that orientation maps could be computed for each field separately, if the
computational demand were not considered excessive) and later to guide incorporation of information from the auxiliary
fields into the reference field.

[0010] The auxiliary fields are then directionally interpolated to the same resolution as the interpolated reference field,
using their inferred orientation maps. 

[0011] A merge mask is determined for each auxiliary field to mask off pixels which should not be used in the final 
enhanced still image; the masked off pixels generally correspond to regions where the motion maps fail to correctly 
model the relationship between the reference and auxiliary fields; such regions might involve uncovered background, 
for example. 

[0012] A weighted average is formed from the reference field pixels and the motion-compensated auxiliary field pixels 
which have not been masked off. The weights associated with this weighted averaging operation are spatially varying 
and depend upon both the merge masks and the displacements recorded in the motion maps. Unlike conventional field 
averaging techniques, this approach does not destroy available picture resolution in the process of removing aliasing 
artifacts. 

[0013] The final still image is obtained after horizontal interpolation by an additional factor of two (to obtain the correct
aspect ratio after the fourfold vertical interpolation described above) and an optional post-processing operation which 
sharpens the image formed from the weighted averaging process described above. The above processing steps are 
modified somewhat for the chrominance components to reflect the fact that these components have much less spatial 
frequency content than the luminance component. 

[0014] An important property of this image enhancement system is that it can work with any number of video fields
at all. If only one field is supplied, the system employs the sophisticated directional interpolation technique mentioned
above. If additional fields are available, they are directionally interpolated and merged into the interpolated reference
field so as to progressively enhance the spatial frequency content, while reducing noise and other artifacts. In the
special case where two fields are available, the system may also be understood as a "de-interlacing" tool.
[0015] Other advantages of this invention will become apparent from the following description taken in conjunction
with the accompanying drawings which set forth, by way of illustration and example, certain embodiments of this
invention. The drawings constitute a part of this specification and include exemplary embodiments, objects and features of
the present invention.

Brief Description of the Drawings 

[0016] 

Figure 1 shows the block structure used for motion estimation and field merging: a) non-overlapping segmentation
of reference field; b) overlapping motion blocks surrounding each segmentation block.

Figure 2 shows the eight orientation classes and their relationship to the target luminance pixel with which they are
associated.

Figure 3 shows a table of orthogonal likelihood values, $L_C^{\perp}$, for each of the directed orientation classes, $C$.

Figure 4 shows directional low-pass filters applied to the pre-conditioned reference field's luminance component to
prepare for calculation of the orientation class likelihood values: a) $L_V$, $L_{V^-}$ and $L_{V^+}$; b) $L_{D^-}$;
c) $L_{D^+}$; d) $L_{O^-}$; and e) $L_{O^+}$.

Figure 5 shows intermediate linear combinations, $v_1$, $v_2$, $v_3$ and $v_4$, of horizontally pre-filtered luminance
pixels used to form the vertical unlikelihood value, $U_V$.

Figure 6 shows horizontally pre-filtered luminance pixels used to form the near vertical unlikelihood values:
a) $U_{V^-}$ and b) $U_{V^+}$.

Figure 7 shows intermediate linear combinations, $d_i^{\pm}$, of diagonally pre-filtered luminance pixels, used to form the
diagonal unlikelihood values: a) $U_{D^-}$ and b) $U_{D^+}$.

Figure 8 shows intermediate linear combinations, $o_i^{\pm}$, of near vertically pre-filtered luminance pixels, used to form
the oblique unlikelihood values: a) $U_{O^-}$ and b) $U_{O^+}$.

Figure 9 shows neighboring class values used to form the smoothed orientation class, $\bar{C}_{m,n}$, associated with the
target pixel at row $m$ and column $n$.

Figure 10 shows an example of the linear directional interpolation strategy used to recover the three missing
luminance samples, $Y_{4m+1,n}$, $Y_{4m+2,n}$ and $Y_{4m+3,n}$, from neighboring original field rows. In this example,
the orientation class is $C_{m,n} = V^+$.

Detailed Description of the Preferred Embodiment

[0017] It should be understood that while certain forms of the invention are illustrated, they are not to be limited to the
specific forms or arrangements of parts herein described and shown. It will be apparent to those skilled in the art that
various changes may be made without departing from the scope of the invention and the invention is not to be
considered limited to what is shown in the drawings and descriptions.

[0018] To facilitate the discussion which follows, let $H_F$ and $W_F$ denote the number of rows (height) and columns
(width) of each digitized video field. Many video digitizers produce fields with $W_F = 640$ columns and $H_F = 240$ rows,
but this need not be the case. The multi-field processing system directionally interpolates the reference field to a
resolution of $H_I = 4H_F$ by $W_I = W_F$ (i.e. vertical expansion by a factor of 4) and then enhances the vertical information
content by adaptively merging the directionally interpolated, motion compensated and appropriately weighted auxiliary
fields into this interpolated reference field. This adaptive merging process also serves to remove aliasing and reduce
noise.

[0019] It should be noted that these dimensions describe only the luminance component of the video signal. The
chrominance components are treated differently. Original chrominance fields each have $H_F$ rows, but only $W_F/4$
columns. The video digitization and decoding operations may produce chrominance components with these resolutions,
or else the process may decimate the chrominance components of a collection of video fields which have already been
decoded. In this way the process reduces the memory requirements and computational demand associated with the
multi-field enhancement operation, without sacrificing actual information. The multi-field processing system
directionally interpolates the reference field's chrominance components to a resolution of $H_I/2 = 2H_F$ by
$W_I/4 = W_F/4$ (i.e. vertical expansion by a factor of 2) and then adaptively merges the directionally interpolated and
motion compensated auxiliary fields' chrominance components into the reference field to reduce chrominance noise and
artifacts. Note that the chrominance components from the various fields are merged by simple averaging, after invalid
pixels from regions which do not conform to the estimated inter-field motion model have been masked off. This temporal
averaging is able to reduce noise and color aliasing artifacts but is not able to enhance the spatial frequency content of
the image. The luminance components from the various fields, however, are merged using a spatially varying weighted
average, whose weights are computed from the estimated inter-field motion so as to remove aliasing while enhancing the
spatial frequency content of the image.
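
By way of illustration only, the resolutions described in the two preceding paragraphs can be tabulated with a short Python sketch. The function name `field_dimensions` and the dictionary keys are hypothetical labels chosen for this example, and the sketch assumes that the field width is a multiple of four so that the chrominance width is an integer.

```python
def field_dimensions(h_f=240, w_f=640):
    """Return (rows, cols) of each intermediate image for an h_f x w_f field.

    Assumes w_f is a multiple of 4 (an assumption of this sketch only).
    """
    if w_f % 4:
        raise ValueError("w_f is assumed to be a multiple of 4")
    return {
        "luma_field": (h_f, w_f),                    # original luminance field
        "luma_interpolated": (4 * h_f, w_f),         # vertical expansion by 4
        "chroma_field": (h_f, w_f // 4),             # H_F rows, W_F/4 columns
        "chroma_interpolated": (2 * h_f, w_f // 4),  # vertical expansion by 2
    }
```

For the common 240 row by 640 column digitized field, this gives a 960 by 640 interpolated luminance image and a 480 by 160 interpolated chrominance image.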

[0020] The final image produced by the system has $H = H_I = 4H_F$ rows by $W = 2W_I = 2W_F$ columns. It is formed
by doubling the horizontal resolution of the luminance component and quadrupling the horizontal resolution and
doubling the vertical resolution of the chrominance components produced by the method described above. These
operations are required to restore the luminance component to the correct aspect ratio and to obtain a full set of chrominance
sample values at every pixel position. In the preferred embodiment of the invention, horizontal doubling of the luminance
resolution may be achieved by applying the interpolation filter kernel

[four-tap interpolation kernel; each coefficient is a multiple of 1/8, numerators not legible]


[0021] This kernel has been selected to preserve the horizontal frequency response of the original video signal, while
allowing for a multiplication-free implementation. The same interpolation kernel is used to expand the horizontal
chrominance resolution by a factor of two, after which the chrominance components are expanded by an additional factor of
two in both directions using conventional bilinear interpolation.
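
The following Python sketch illustrates how a horizontal doubling of this kind might be carried out with shifts and additions only. The kernel coefficients are not legible in this text, so the sketch ASSUMES a four-tap half-sample kernel of (-1, 5, 5, -1)/8; that choice, the function name `double_width`, the edge clamping and the rounding rule are all assumptions of the example, not details taken from the disclosure. Input samples are assumed to be integers.

```python
def double_width(row):
    """Double the length of one row of integer samples.

    Even outputs copy the input samples. Odd outputs are half-sample values
    from the ASSUMED kernel (-1, 5, 5, -1)/8, using shifts and adds only.
    """
    n = len(row)
    out = []
    for j in range(n):
        a = row[max(j - 1, 0)]       # left outer tap (edge clamped)
        b = row[j]                   # left inner tap
        c = row[min(j + 1, n - 1)]   # right inner tap (edge clamped)
        d = row[min(j + 2, n - 1)]   # right outer tap (edge clamped)
        inner = b + c
        acc = (inner << 2) + inner - a - d   # 5*(b + c) - (a + d), no multiply
        out.append(b)
        out.append((acc + 4) >> 3)           # divide by 8 with rounding
    return out
```

Because the assumed coefficients sum to one, flat regions pass through unchanged, and the factor of five is formed as a shift by two plus one addition.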

[0022] Section 2 below discloses a method to estimate local orientations within the reference field, along with an
interpolation procedure used to directionally interpolate the reference and auxiliary fields according to the estimated
orientation map. Section 1 discloses a method to obtain the motion maps between the reference and auxiliary fields. Finally,
Section 3 discloses methods to build merge masks and merge weighting factors, along with the fast algorithm used to
actually merge the reference and auxiliary fields into an enhanced still image.

1 Motion Estimation between Reference and Auxiliary Fields

[0023] The reference field is first segmented into non-overlapping blocks which have approximately 15 field rows and
23 field columns each in the current implementation. This segmentation is depicted in Figure 1a. Each of these
segmentation blocks 10 is surrounded by a slightly larger motion block 12, as depicted in Figure 1b. Adjacent motion
blocks overlap one another by two field rows 14 and four field columns 16 in the current implementation. The motion
estimation sub-system is responsible for computing a single motion vector for each motion block, for each auxiliary field.
The motion vector is intended to describe the displacement of scene objects within the motion block, between the
reference and the relevant auxiliary field. These motion vectors are used to guide the process described in Section 3,
whereby interpolated reference and auxiliary fields are merged into a single image. This merging process is performed
independently on each motion block, after which the merged motion blocks are stitched together to form the final image.
A smooth weighting function is used to average overlapping regions from adjacent motion blocks. The purpose of this
section is to describe only the method used to estimate motion vectors for any given motion block.
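
One possible way to lay out the segmentation blocks and their surrounding motion blocks is sketched below in Python. The function name `block_layout` and its parameters are hypothetical. The sketch assumes that each motion block extends its segmentation block by one field row and two field columns on every side, which is one way of obtaining the stated overlap of two rows and four columns between interior neighbours; the disclosure does not say how the margin is distributed, nor how blocks at the field boundary are sized.

```python
def block_layout(h_f, w_f, seg_rows=15, seg_cols=23, pad_rows=1, pad_cols=2):
    """Yield (segmentation_block, motion_block) pairs in lexicographical order.

    Each block is (top, bottom, left, right) with bottom/right exclusive.
    The symmetric padding is an assumption of this sketch.
    """
    for top in range(0, h_f, seg_rows):
        for left in range(0, w_f, seg_cols):
            bottom = min(top + seg_rows, h_f)
            right = min(left + seg_cols, w_f)
            motion = (max(top - pad_rows, 0), min(bottom + pad_rows, h_f),
                      max(left - pad_cols, 0), min(right + pad_cols, w_f))
            yield (top, bottom, left, right), motion
```

With a 240 by 640 field this produces 16 rows of 28 blocks, the last block in each row being narrower than the others, which is consistent with the block size being only approximate.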

[0024] For each of the auxiliary fields, the system processes the motion blocks in lexicographical fashion. Estimation
of the motion vector for each of these blocks proceeds in three distinct phases. The first phase attempts to predict the
motion vector based on the motion vectors obtained from previously processed, neighboring motion blocks. The
prediction strategy is described in Section 1.1. This predicted motion vector is used to bias the efficient and robust coarse
motion estimation technique, described in Section 1.2, which estimates motion to full pixel accuracy only. The final
refinement to quarter pixel accuracy in the vertical direction and half pixel accuracy in the horizontal direction is
performed using a conventional MAD block matching technique. Details are supplied below in Section 1.3.

1.1 Motion Prediction 

[0025] To facilitate the discussion, let

$$\mathbf{v}_{m,n} = \left(v^x_{m,n},\, v^y_{m,n}\right)$$

denote the coarse - i.e. pixel resolution - motion vector for the (m,n)th motion block, i.e. the nth motion block in the mth
row of motion blocks associated with the reference field. Since motion vectors are estimated in lexicographical order,
the neighboring coarse motion vectors, $\mathbf{v}_{m,n-1}$, $\mathbf{v}_{m-1,n-1}$, $\mathbf{v}_{m-1,n}$ and
$\mathbf{v}_{m-1,n+1}$, have already been estimated and can be used to form an initial prediction for $\mathbf{v}_{m,n}$.
In particular, the motion estimation sub-system sets the predicted vector, $\bar{\mathbf{v}}_{m,n}$, to be the arithmetic
mean of the three least disparate of these four neighboring motion vectors. For the purpose of this computation, the
disparity among any collection of three motion vectors, $\mathbf{v}_1$, $\mathbf{v}_2$ and $\mathbf{v}_3$, is defined as

$$\sum_{i=1}^{3} \left\| \mathbf{v}_i - \tfrac{1}{3}\left(\mathbf{v}_1 + \mathbf{v}_2 + \mathbf{v}_3\right) \right\|_1 ,$$

i.e. the sum of the $L^1$ distances between each of the vectors and their arithmetic mean.
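
The prediction rule can be illustrated with the following Python sketch. The function names `disparity` and `predict_vector` are hypothetical, motion vectors are represented as (horizontal, vertical) tuples, and the sketch assumes that all four neighbouring vectors are available; the treatment of blocks on the field boundary, and any rounding of the predicted vector, are not addressed here.

```python
from itertools import combinations

def disparity(vectors):
    """Sum of L1 distances between each vector and the arithmetic mean."""
    k = len(vectors)
    mean = (sum(v[0] for v in vectors) / k, sum(v[1] for v in vectors) / k)
    return sum(abs(v[0] - mean[0]) + abs(v[1] - mean[1]) for v in vectors)

def predict_vector(neighbours):
    """Mean of the three least disparate of the four neighbouring vectors."""
    best = min(combinations(neighbours, 3), key=disparity)
    return (sum(v[0] for v in best) / 3, sum(v[1] for v in best) / 3)
```

Choosing the least disparate triple means that a single outlying neighbour, such as a vector from a block straddling a moving object boundary, does not pull the prediction away from the locally consistent motion.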

[0026] The purpose of forming this prediction, $\bar{\mathbf{v}}_{m,n}$, is not to reduce the full motion vector search
time, but rather to encourage the development of smooth motion maps. In regions of the scene which contain little
information from which to estimate motion, we prefer to adopt the predicted vectors whenever this is reasonable.
Otherwise, the "random" motion vectors usually produced by block motion estimation algorithms in these regions can cause
annoying visual artifacts when attempts are made to merge the reference and auxiliary fields into a single still image.

1.2 Coarse Motion Estimation 

[0027] In general, the process produces a coarse - i.e. full pixel accuracy - estimate of the motion, $\mathbf{v}_{m,n}$,
between the reference field and a given auxiliary field, for the (m,n)th motion block. Moreover, we would like to bias the
estimate towards the predicted vector, $\bar{\mathbf{v}}_{m,n}$, whenever this is consistent with the features found in the
two fields. The coarse motion estimation sub-system performs a full search which, in one embodiment of the system,
involves a search range of $|v^x_{m,n}| \le 10$ field columns and $|v^y_{m,n}| \le 5$ field rows. For each vector,
$\mathbf{v}$, in this range, the process computes an objective function, $O(\mathbf{v})$, which will be discussed below.
To facilitate the discussion, let $O_{\min}$ denote the minimum value attained by $O(\mathbf{v})$ over the search range
and let $\mathcal{V}$ denote the set of all motion vectors, $\mathbf{v}$, such that

$$O(\mathbf{v}) < O_{\min} + T_m,$$

where $T_m$ is a pre-defined threshold. The final coarse motion estimate, $\mathbf{v}_{m,n}$, is taken to be the vector,
$\mathbf{v} \in \mathcal{V}$, which is closest to the predicted vector, $\bar{\mathbf{v}}_{m,n}$. Here, the $L^1$ distance
metric is used, so that the distance between $\mathbf{v}$ and $\bar{\mathbf{v}}_{m,n}$ is

$$\left|v^x - \bar{v}^x_{m,n}\right| + \left|v^y - \bar{v}^y_{m,n}\right|.$$
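
A minimal Python sketch of this biased selection is given below. The function name `select_coarse_vector` and its arguments are hypothetical; `objective` stands for any callable that returns the objective value for a candidate (horizontal, vertical) displacement. The sketch assumes a strictly positive threshold, so that the admissible set always contains at least the minimising vector, and it resolves ties in distance by search order, which the disclosure does not specify.

```python
def select_coarse_vector(objective, predicted, t_m, max_dx=10, max_dy=5):
    """Return the admissible full-pixel vector closest (L1) to `predicted`.

    A vector is admissible when its objective value is within t_m of the
    minimum over the search range. Assumes t_m > 0.
    """
    candidates = [(dx, dy)
                  for dy in range(-max_dy, max_dy + 1)
                  for dx in range(-max_dx, max_dx + 1)]
    costs = {v: objective(v) for v in candidates}
    o_min = min(costs.values())
    admissible = [v for v in candidates if costs[v] < o_min + t_m]
    return min(admissible,
               key=lambda v: abs(v[0] - predicted[0]) + abs(v[1] - predicted[1]))
```

A larger threshold admits more near-optimal candidates and therefore pulls the estimate more strongly towards the prediction, which favours smooth motion maps in regions with little texture.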

[0028] The actual objective function, $O(\mathbf{v})$, used for coarse motion estimation, is disclosed as follows. Rather than
using the computationally intensive Mean Absolute Difference (MAD) objective, the process constructs the objective
function from a novel 2 bit per pixel representation of the reference and auxiliary fields. Specifically, the luminance
components of the reference and auxiliary fields are first pre-conditioned with a spatial bandpass filter, as described in
Section 1.2.1; a two bit representation of each bandpass filtered luminance sample is formed by a simple thresholding
technique, described in Section 1.2.2; and then the objective function is evaluated by a combination of "exclusive or"
(XOR) and counting operations which are applied to the two bit pixel representations, as described in Section 1.2.3.

1.2.1 Bandpass Filtering for Coarse Motion Estimation 

[0029] A simple bandpass filter is constructed, in one embodiment, by taking the difference of two moving window
lowpass filters. Specifically, let $y[i,j]$ denote the luminance sample from any given field at row $i$ and column $j$. The
bandpass filtered pixel, $\tilde{y}[i,j]$, is computed according to



[0030] Here, $L_x$ and $L_y$ are the width and height of the "local-scale" moving average window, while $W_x$ and $W_y$ are
the width and height of the "wide-scale" moving average window. The scaling operations may be reduced to shift
operations by ensuring that each of these four dimensions is a power of 2, in which case the entire bandpass filtering
operation may be implemented with four additions, four subtractions and two shifts per pixel. In our particular
implementation, the dimensions $L_x = L_y = 4$, $W_x = 32$ and $W_y = 16$ were found to optimize the robustness of the
overall motion estimation scheme. It is worth noting that this bandpass filtering operation desensitizes the motion estimator
to inter-field illumination variations, as well as to high frequency aliasing artifacts in the individual fields. At the same
time it produces pixels which have zero mean - a key requirement for the generation of useful two bit per pixel
representations according to the method described in the following section.
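
The difference-of-moving-averages idea can be sketched in Python as follows. This is a direct, unoptimised illustration rather than the running-sum implementation alluded to above. The function names `box_sum` and `bandpass`, the way each window is positioned around the pixel, and the clamping of coordinates at the field edges are all assumptions of the sketch, since the exact window alignment is not recoverable from this text.

```python
def box_sum(img, i, j, height, width):
    """Sum of a height x width window roughly centred on (i, j), edges clamped."""
    rows, cols = len(img), len(img[0])
    total = 0
    for di in range(-(height // 2), height - height // 2):
        r = min(max(i + di, 0), rows - 1)
        for dj in range(-(width // 2), width - width // 2):
            c = min(max(j + dj, 0), cols - 1)
            total += img[r][c]
    return total

def bandpass(img, lx=4, ly=4, wx=32, wy=16):
    """Local-scale moving average minus wide-scale moving average."""
    return [[box_sum(img, i, j, ly, lx) / (lx * ly)
             - box_sum(img, i, j, wy, wx) / (wx * wy)
             for j in range(len(img[0]))]
            for i in range(len(img))]
```

Because both averages are normalised, adding a constant to every sample leaves the output unchanged, which is the insensitivity to illumination changes noted above.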

1.2.2 Two Bit Pixel Representation for Coarse Motion Estimation

[0031] After bandpass filtering, each filtered sample, $\tilde{y}[i,j]$, is assigned a two bit representation in the preferred
embodiment of the invention, where this representation is based on a parameter, $T_b$. The first bit is set to 1 if
$\tilde{y}[i,j] > T_b$ and 0 otherwise, while the second bit is set to 1 if $\tilde{y}[i,j] < -T_b$ and 0 otherwise. This two
bit representation entails some redundancy in that it quantizes $\tilde{y}[i,j]$ into only three different regimes. The
representation has the following important property, however. If $Q(y_1, y_2)$ represents the total number of 1 bits in the
two bit result obtained by taking the exclusive or of corresponding bits in the two bit representations associated with
$y_1$ and $y_2$, then it is easy to verify that $Q(y_1, y_2)$
satisfies the following relationship.

$$
Q(y_1, y_2) =
\begin{cases}
0, & y_1 > T_b \text{ and } y_2 > T_b \\
0, & -T_b \le y_1 \le T_b \text{ and } -T_b \le y_2 \le T_b \\
0, & y_1 < -T_b \text{ and } y_2 < -T_b \\
1, & y_1 > T_b \text{ and } -T_b \le y_2 \le T_b \\
1, & y_2 > T_b \text{ and } -T_b \le y_1 \le T_b \\
1, & y_1 < -T_b \text{ and } -T_b \le y_2 \le T_b \\
1, & y_2 < -T_b \text{ and } -T_b \le y_1 \le T_b \\
2, & y_1 > T_b \text{ and } y_2 < -T_b \\
2, & y_2 > T_b \text{ and } y_1 < -T_b
\end{cases}
$$



[0032] Thus, $Q(y_1, y_2)$ may be interpreted as a measure of the distance between $y_1$ and $y_2$.
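
The two bit code and the associated distance can be illustrated with a few lines of Python. The function names `two_bit` and `q_distance` are hypothetical, and the choice of which bit is stored in the more significant position is arbitrary in this sketch.

```python
def two_bit(y, t_b):
    """Two bit code: the high bit flags y > t_b, the low bit flags y < -t_b."""
    return (int(y > t_b) << 1) | int(y < -t_b)

def q_distance(y1, y2, t_b):
    """Number of 1 bits in the XOR of the two codes (0, 1 or 2)."""
    return bin(two_bit(y1, t_b) ^ two_bit(y2, t_b)).count("1")
```

Samples in the same regime give a distance of 0, samples in adjacent regimes give 1, and samples of opposite sign that both exceed the threshold in magnitude give 2.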



1.2.3 The Coarse Motion Estimation Objective Function 



[0033] Our objective function, $O(\mathbf{v})$, is constructed by taking the sum of the two bit distance metric,

$$Q\left(\tilde{y}_r[i,j],\ \tilde{y}_a[i + v^y,\, j + v^x]\right),$$

over all pixels, $(i,j)$, lying within a coarse matching block; this coarse matching block is generally larger than the
motion block itself. Here $\tilde{y}_r[i,j]$ is the bandpass filtered sample at row $i$ and column $j$ of the reference field,
while $\tilde{y}_a[i,j]$ is the bandpass filtered sample at row $i$ and column $j$ of the auxiliary field. In our
implementation the coarse matching block consists of 20 field rows by 32 field columns, surrounding the motion block of interest.
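
A straightforward Python sketch of this objective is shown below. The function name `coarse_objective` and its arguments are hypothetical. The two fields are assumed to have already been converted to two bit codes stored as small integers, and pixels whose displaced position falls outside the auxiliary field are simply skipped, which is an assumption of the sketch rather than a rule stated in the disclosure.

```python
def coarse_objective(ref_codes, aux_codes, block, v):
    """Sum of two bit XOR distances over a coarse matching block.

    ref_codes, aux_codes: 2-D lists of two bit codes.
    block: (top, bottom, left, right) with bottom/right exclusive.
    v: (dx, dy) displacement in field columns and field rows.
    """
    top, bottom, left, right = block
    dx, dy = v
    rows, cols = len(aux_codes), len(aux_codes[0])
    total = 0
    for i in range(top, bottom):
        for j in range(left, right):
            ii, jj = i + dy, j + dx
            if 0 <= ii < rows and 0 <= jj < cols:
                total += bin(ref_codes[i][j] ^ aux_codes[ii][jj]).count("1")
    return total
```

In a practical implementation the codes for many pixels would be packed into machine words so that a single XOR and bit count handles a whole run of pixels at once.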

1.3 Refinement to Sub-Pixel Accuracy

[0034] In one embodiment of the invention, a conventional MAD (Mean Absolute Difference) search is performed, with
a search range of one field column and half a field row around the vector returned by the coarse motion estimation
sub-system, searching in increments of half the field column separation and a quarter of the field row separation. Only the
reference field need be interpolated to the higher resolution (four times vertical and twice horizontal resolution) in order
to achieve this sub-pixel accuracy search. The auxiliary fields are pre-conditioned by applying a five tap vertical low-pass
filter with kernel,



prior to performing the motion refinement search. This low-pass filtering reduces sensitivity to vertical aliasing. 
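
The candidate set visited by such a refinement search, and the matching criterion, might be sketched in Python as follows. The function names `refinement_candidates` and `mad` are hypothetical. The sketch interprets the stated range as plus or minus one field column and plus or minus half a field row around the coarse vector, which is an assumption, and it omits the interpolation of the reference field and the vertical pre-filtering of the auxiliary field.

```python
def refinement_candidates(coarse):
    """Sub-pixel candidates around a coarse (dx, dy) vector, in field units.

    Horizontal: +/-1 column in half-column steps.
    Vertical:   +/-0.5 row in quarter-row steps.
    """
    dx0, dy0 = coarse
    return [(dx0 + sx / 2, dy0 + sy / 4)
            for sy in range(-2, 3)
            for sx in range(-2, 3)]

def mad(block_a, block_b):
    """Mean absolute difference between two equally sized 2-D blocks."""
    diffs = [abs(a - b)
             for row_a, row_b in zip(block_a, block_b)
             for a, b in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)
```

Under this interpretation only 25 candidates are evaluated per motion block, so the cost of the conventional MAD criterion is modest compared with applying it over the full coarse search range.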
2 Directional Interpolation of each Field
2.1 Orientation Estimation 

[0035] The object of orientation estimation is to identify the direction of image edges in the neighborhood of any given
pixel, so that the intra-field vertical interpolation operation described in Section 2.2 below can interpolate along
rather than across an edge. In addition to correct identification of edge orientation, one key requirement for the
estimator is that the resulting orientation map be as smooth as possible. It is preferred that the orientation map not fluctuate
wildly in textured regions or in smooth regions which lie near to actual image edges, because such fluctuations might
manifest themselves as visually disturbing artifacts after interpolation. It is therefore important to control the numerical
complexity of the estimation technique.

[0036] The orientation estimation sub-system works with the luminance component of the reference field. For each
"target" luminance pixel in this field, an orientation class is selected. The orientation class associated with a given target
row and target column is to be interpreted as the dominant feature orientation observed in a neighborhood whose
centroid lies between the target and next field rows and between the target and next field columns. This centroid is marked
with a cross pattern 20 in Figure 2. The figure also illustrates the set of eight orientation classes which may be
associated with each target luminance pixel; they are:

22 $N$: No distinct orientation.

24 $V$: Distinct orientational feature at 90° (vertical).

26 $V^-$: Distinct orientational feature at 63° (near vertical) from top-left to bottom-right.

28 $V^+$: Distinct orientational feature at 63° (near vertical) from top-right to bottom-left.

30 $D^-$: Distinct orientational feature at 45° (diagonal) from top-left to bottom-right.

32 $D^+$: Distinct orientational feature at 45° (diagonal) from top-right to bottom-left.

34 $O^-$: Distinct orientational feature at 27° (oblique) from top-left to bottom-right.

36 $O^+$: Distinct orientational feature at 27° (oblique) from top-right to bottom-left.


[0037] The orientation class associations for each luminance pixel in the reference field constitute the orientation map.
[0038] The estimation strategy consists of a number of elements, whose details are discussed separately below.
Essentially, a numerical value, $L_C$, is assigned to each of the distinctly oriented classes,
$C \in \{V, V^-, V^+, D^-, D^+, O^-, O^+\}$, which is to be interpreted as the likelihood that a local orientation feature
exists with the corresponding orientation. The estimated orientation class is tentatively set to the distinct orientation
class, $C$, which has the maximum value, $L_C$. The likelihood value, $L_C$, for the selected class is then compared with an
orthogonal likelihood value, $L_C^{\perp}$, which represents the likelihood associated with the orthogonal direction. If the
difference between $L_C$ and $L_C^{\perp}$ is less than a predetermined threshold, the orientation class is set to $N$, i.e.
no distinct orientation. The orthogonal likelihood value is obtained from the table of Figure 3. The orientation map
obtained in the manner described above is subjected to a final morphological smoothing operation to minimize the number
of disturbing artifacts produced during directional interpolation. This smoothing operation is described in Section 2.1.3.
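
The tentative classification rule for a single target pixel can be illustrated with the following Python sketch. The function name `classify` and the use of dictionaries keyed by class name are hypothetical. The orthogonal likelihood values are taken as an input because the table of Figure 3 is not reproduced in this text, and the subsequent morphological smoothing is not included.

```python
def classify(likelihood, orthogonal, threshold):
    """Pick the orientation class for one target pixel.

    likelihood, orthogonal: dicts keyed by directed class name.
    Returns the class with the largest likelihood, or "N" when that
    likelihood exceeds its orthogonal likelihood by less than `threshold`.
    """
    best = max(likelihood, key=likelihood.get)
    if likelihood[best] - orthogonal[best] < threshold:
        return "N"
    return best
```

The comparison against the orthogonal direction is what prevents flat or textured neighbourhoods, in which every direction looks about equally likely, from being assigned a spurious orientation.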

[0039] To compute the likelihood values, $L_C$, for each directed orientation class, $C$, the luminance pixels are
processed first, using a directional low-pass filter which smooths in a direction which is approximately perpendicular to that
of $C$. $L_C$ is then based on a Total Variation (TV) metric, computed along a trajectory which is parallel to the orientation
of $C$; the larger the variation, the smaller the likelihood value. The seven directional filtering operations are described in
Section 2.1.1, while the TV metric is described in Section 2.1.2 below.

2.1.1 Oriented Pre-Filtering of the Luminance Field 

[0040] The reference field's luminance component is first pre-conditioned by applying a vertical low-pass filter with
the three tap kernel, 



[0041] One purpose of this pre-conditioning filter is to reduce the influence of vertical aliasing artifacts which can
adversely affect the estimation sub-system. 

[0042] To prepare the pre-conditioned luminance field for calculation of vertical likelihood values, $L_V$, and near vertical
likelihood values, $L_{V^-}$ and $L_{V^+}$, we apply the horizontal low-pass filter whose five taps are illustrated in Figure 4a.

[0043] To prepare the pre-conditioned luminance field for calculation of the diagonal likelihood values, $L_{D^-}$, we apply
the diagonal low-pass filter whose three taps 42 are illustrated in Figure 4b. The complementary filter, whose three taps
44 are illustrated in Figure 4c, is used to prepare for calculation of the complementary diagonal likelihood values, $L_{D^+}$.
[0044] Finally, to prepare the pre-conditioned luminance field for calculation of the oblique likelihood values, $L_{O^-}$ and
$L_{O^+}$, we apply the near vertical low-pass filters 46, 48 illustrated in Figures 4d and 4e, respectively.

2.1.2 The Directional TV Metric 

[0045] For each directed orientation class, $C$, the likelihood values, $L_C$, are found by negating a set of corresponding
"unlikelihood" values, $U_C$. The unlikelihood values for each directed orientation class are computed by applying an
appropriate "total variation" measure to the reference field's luminance component, after pre-conditioning and
appropriate directional pre-filtering, as described in Section 2.1.1 above.
[0046] The vertical unlikelihood value is calculated from

$$U_V = 6 \cdot |v_2 - v_3| + 2 \cdot |v_1 - v_2| + 2 \cdot |v_3 - v_4|,$$

where $v_1$, $v_2$, $v_3$ and $v_4$ are linear combinations of pixels from the pre-conditioned and horizontally pre-filtered
luminance field; these linear combinations are depicted in Figure 5. The centroid of this calculation lies halfway between the
target 50 and next field row 52 and halfway between the target 54 and next field column 56, which is in agreement with
Figure 2.

[0047] The near vertical unlikelihood values are calculated from

$$U_{V^-} = 4 \cdot |v_2^- - v_3^-| + 3 \cdot |v_1^- - v_2^-| + 3 \cdot |v_3^- - v_4^-|$$

and

$$U_{V^+} = 4 \cdot |v_2^+ - v_3^+| + 3 \cdot |v_1^+ - v_2^+| + 3 \cdot |v_3^+ - v_4^+|,$$

where the $v_i^{\pm}$ terms represent pixel values from the pre-conditioned and horizontally pre-filtered luminance field; the
relevant pixels are depicted in Figures 6a and 6b. Again, the centroid of these calculations lies half a field row below and
half a field column to the right of the target field row 60 and column 62, as required for consistency with the definition of
the orientation classes.

[0048] The diagonal unlikelihood values are calculated from

$$U_{D^-} = |d_1^- - d_3^-| + 2\,|d_2^- - d_4^-| + 2\,|d_3^- - d_5^-| + 2\,|d_4^- - d_6^-| + 2\,|d_5^- - d_7^-| + |d_6^- - d_8^-|$$

and

$$U_{D^+} = |d_1^+ - d_3^+| + 2\,|d_2^+ - d_4^+| + 2\,|d_3^+ - d_5^+| + 2\,|d_4^+ - d_6^+| + 2\,|d_5^+ - d_7^+| + |d_6^+ - d_8^+|,$$

where the $d_i^{\pm}$ terms each represent a linear combination of two pixel values from the pre-conditioned and diagonally pre-filtered luminance field. The $d_i^-$ terms are formed after applying the diagonal pre-filter shown in Figure 4b, while the $d_i^+$ terms are formed after applying the diagonal pre-filter shown in Figure 4c. The pixels and weights used to form the $d_i^-$ and $d_i^+$ terms are illustrated in Figures 7a and 7b, respectively. Notice that the centroid of these calculations again lies half a field row below and half a field column to the right of the target field row 70 and column 72, as required for consistency with the definition of the orientation classes.
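
As a rough illustration of the diagonal measure, the following Python sketch evaluates a weighted total variation over eight d terms, using weights 1, 2, 2, 2, 2, 1 on differences between terms two positions apart, and negates the result to obtain a likelihood. The function names and sample values are hypothetical, and forming the d terms from the pre-filtered field (Figures 7a and 7b) is not shown.

```python
def diagonal_unlikelihood(d):
    """Weighted total variation over the eight d terms d[0]..d[7] (d_1..d_8)."""
    if len(d) != 8:
        raise ValueError("expected eight d terms")
    weights = [1, 2, 2, 2, 2, 1]          # one weight per stride-2 difference
    return sum(w * abs(d[i] - d[i + 2]) for i, w in enumerate(weights))


def likelihood(unlikelihood):
    """The likelihood is the negated unlikelihood."""
    return -unlikelihood


flat = [10, 10, 10, 10, 10, 10, 10, 10]   # no variation along the direction
ramp = [0, 1, 2, 3, 4, 5, 6, 7]           # steady variation along the direction
print(diagonal_unlikelihood(flat), diagonal_unlikelihood(ramp))
```

A class whose direction follows an edge sees little variation along that direction, so its unlikelihood is small and its likelihood is correspondingly large.
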
[0049] Finally, the oblique unlikelihood values, $U_{O^-}$ and $U_{O^+}$, are each calculated as a sum of five absolute differences between pairs of $o_i^-$ or $o_i^+$ terms, respectively, with every difference weighted by 2,

$$U_{O^{\pm}} = 2 \sum_{(i,j)} |o_i^{\pm} - o_j^{\pm}|,$$

where the $o_i^{\pm}$ terms each represent a linear combination of two pixel values from the pre-conditioned and near vertically pre-filtered luminance field. The $o_i^-$ terms are formed after applying the near vertical pre-filter shown in Figure 4d, while the $o_i^+$ terms are formed after applying the near vertical pre-filter shown in Figure 4e. The pixels and weights used to form the $o_i^-$ and $o_i^+$ terms are illustrated in Figures 8a and 8b, respectively. Notice that the centroid of these calculations again lies half a field row below and half a field column to the right of the target field row 80 and column 82, as required for consistency with the definition of the orientation classes.

2.1.3 Morphological Smoothing of the Orientation Map

[0050] The morphological smoothing operator takes, as its input, the initial classification of each pixel in the reference field into one of the orientation classes, $V$, $V^-$, $V^+$, $D^-$, $D^+$, $O^-$ or $O^+$, and produces a new classification which generally has fewer inter-class transitions. To facilitate the discussion, let $C_{m,n}$ denote the initial orientation classification associated with target field row m 90 and target field column n 92. The smoothing operator generates a potentially different classification, $\bar{C}_{m,n}$, by considering $C_{m,n}$ together with the 14 neighbors 94 depicted in Figure 9. The smoothing policy is that the value of $\bar{C}_{m,n}$ should be identical to $C_{m,n}$, unless either a majority of the 6 neighbors 96 lying to the left of the target pixel and a majority of the 6 neighbors 98 lying to the right of the target pixel all have the same classification, $C$, or a majority of the 5 neighbors 100 lying above the target pixel and a majority of the 5 neighbors 102 lying below the target pixel all have the same classification, $C$. In either of these two cases, the value of $\bar{C}_{m,n}$ is set to $C$.
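
The smoothing policy can be sketched in Python as follows. This is only an illustration under stated assumptions: "majority" is read as a strict majority (at least 4 of 6, or 3 of 5), border pixels are left unchanged, and when both the left/right and the above/below tests succeed the left/right result is taken first. None of these details, nor the function names, come from the text.

```python
from collections import Counter


def majority_class(labels):
    """Return the class held by a strict majority of labels, or None."""
    label, count = Counter(labels).most_common(1)[0]
    return label if 2 * count > len(labels) else None


def smooth_orientation_map(cls):
    """Smooth a 2-D list of orientation classes over a 3-row by 5-column window."""
    rows, cols = len(cls), len(cls[0])
    out = [row[:] for row in cls]
    # Border pixels are left unchanged (an assumption of this sketch).
    for m in range(1, rows - 1):
        for n in range(2, cols - 2):
            left = [cls[r][c] for r in (m - 1, m, m + 1) for c in (n - 2, n - 1)]
            right = [cls[r][c] for r in (m - 1, m, m + 1) for c in (n + 1, n + 2)]
            above = [cls[m - 1][c] for c in range(n - 2, n + 3)]
            below = [cls[m + 1][c] for c in range(n - 2, n + 3)]
            for first, second in ((left, right), (above, below)):
                a, b = majority_class(first), majority_class(second)
                if a is not None and a == b:
                    out[m][n] = a
                    break
    return out


grid = [["V"] * 5, ["V", "V", "D-", "V", "V"], ["V"] * 5]
print(smooth_orientation_map(grid)[1][2])
```

In the small grid above, the isolated "D-" pixel is surrounded by "V" neighbors on both sides, so it is relabeled "V".
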

2.2 Interpolation

[0051] This section describes the directional interpolation strategy which is used to quadruple the number of luminance rows and double the number of chrominance rows. To facilitate the ensuing discussion, let $Y_{4m,n}$ in Figure 10 denote the luminance pixel at field row m and field column n. The purpose of luminance interpolation is to derive three new luminance rows, $Y_{4m+1,n}$, $Y_{4m+2,n}$ and $Y_{4m+3,n}$, between every pair of original luminance rows, $Y_{4m,n}$ 120 and $Y_{4m+4,n}$ 122. Similarly, the purpose of chrominance interpolation is to derive one new chrominance row, $C_{2m+1,k}$, between every pair of original chrominance rows, $C_{2m,k}$ and $C_{2m+2,k}$. Only one of the chrominance components is explicitly referred to, with the understanding that both chrominance components should be processed identically. Also, we use the index k, rather than n, to denote chrominance columns, since the horizontal chrominance resolution is only one quarter of the horizontal luminance resolution.

[0052] As described above, $C_{m,n}$ refers to the local orientation class associated with a region whose centroid lies between field rows m and m+1. The missing luminance samples, $Y_{4m+1,n}$, $Y_{4m+2,n}$ and $Y_{4m+3,n}$, are linearly interpolated based on a line 124 drawn through the missing sample location 126 with the orientation, $C_{m,n}$. Figure 10 illustrates this process for a near-vertical orientation class of $C_{m,n} = V^+$. Note that the original field rows, $Y_{4m,n}$ 120 and $Y_{4m+4,n}$ 122, must often be horizontally interpolated to find sample values on the end-points 120 of the oriented interpolation lines. In one embodiment, the interpolation filter of equation (1) is used to minimize loss of spatial frequency content during this interpolation process. The non-directed orientation class, $N$, defaults to the same vertical interpolation strategy as the $V$ class.

[0053] Chrominance components are treated similarly. Specifically, the missing chrominance sample, $C_{2m+1,k}$, is linearly interpolated based on a line drawn through the missing sample location with the orientation, $C_{m,2k}$. Again, chrominance samples from the original field rows, $C_{2m,k}$ and $C_{2m+2,k}$, must often be horizontally interpolated to find sample values on the end-points of the oriented interpolation lines.
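
A minimal Python sketch of the luminance case is given below. It assumes that an orientation class can be summarized by a horizontal displacement `shift`, in columns per field row, which is a hypothetical parameterization, and it substitutes plain linear horizontal interpolation for the filter of equation (1). The sign convention for `shift` and the clamping at the row ends are also assumptions.

```python
def sample(row, x):
    """Linearly interpolate row at fractional column x, clamping at the ends."""
    x = min(max(x, 0.0), len(row) - 1.0)
    i = min(int(x), len(row) - 2)
    frac = x - i
    return (1.0 - frac) * row[i] + frac * row[i + 1]


def interpolate_missing_rows(top, bottom, shift):
    """Return rows 4m+1, 4m+2, 4m+3 between original rows 4m (top) and 4m+4 (bottom).

    shift: horizontal displacement of the oriented line in columns per field
           row (hypothetical stand-in for the orientation class).
    """
    rows = []
    for k in (1, 2, 3):
        t = k / 4.0                   # fractional distance below the top row
        rows.append([
            # End-points of the line through (4m+k, n) on the two original rows.
            (1.0 - t) * sample(top, n - shift * t)
            + t * sample(bottom, n + shift * (1.0 - t))
            for n in range(len(top))
        ])
    return rows


vertical = interpolate_missing_rows([0.0] * 4, [4.0] * 4, shift=0.0)
print(vertical[0], vertical[1], vertical[2])
```

With `shift=0.0` the sketch reduces to ordinary vertical linear interpolation, which matches the behavior described for the $V$ and $N$ classes.
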

3 Adaptive, Non-Stationary Merging of Interpolated Fields 

[0054] This section describes the method used to merge spatially interpolated pixels from the auxiliary fields into the reference field to produce a high quality still image. As mentioned in Section 1, the merging operation is performed independently on each of the overlapping motion blocks illustrated in Figure 1. The image is stitched together from these overlapping blocks using a smooth transition function within the overlapping regions. The discussion which follows considers the operations used to merge a single motion block; for simplicity, these operations will be described as though this motion block occupied the entire reference field.

[0055] The merging process is guided by a single sub-pixel accurate motion vector for each of the auxiliary fields. The 
first step involves generation of a merge mask for each auxiliary field, to identify the regions in which this motion vector 
may be considered to describe scene motion between the reference field and the relevant auxiliary field. Merge mask 
generation is discussed in Section 3.1 below. The next step is to assign weights to each spatially interpolated pixel in 
the reference and auxiliary fields, identifying the contribution that each will make to the merged image. This step is dis- 
cussed in Section 3.2 below. Finally, the weighted average of pixels from the various fields is formed using the fast tech- 
nique described in Section 3.3 below. 

3.1 Generation of Merge Masks 

[0056] For any given auxiliary field, the general objective of the merge mask generation sub-system is to determine a binary mask value for each pixel in the original reference field, not the interpolated reference field. The mask value for a particular pixel identifies whether or not the motion vector associated with the given auxiliary field correctly describes scene motion between the reference and auxiliary fields in the vicinity of that pixel. Our basic approach for generating these masks involves computing a directionally sensitive weighted average of neighboring pixels in the reference field and corresponding motion compensated pixels in the auxiliary field and comparing these averages. One key to the success of this method is the directional sensitivity of the local pixel averages.

[0057] To facilitate the ensuing discussion, let $y_r[i,j]$ denote the luminance sample at row i and column j in the reference field. For convenience, let $y_a[i,j]$ denote the corresponding pixel in the auxiliary field, after compensating for the estimated motion vector. Note that motion compensation may involve sub-pixel interpolation, since our motion vectors are estimated to sub-pixel accuracy. If the motion vector correctly describes motion in the vicinity of pixel $(i,j)$, it might be expected that neighborhood averages around $y_r[i,j]$ and $y_a[i,j]$ would yield similar results. One concern is with image edges, where the success of subsequent field merging depends critically on motion vector accuracy in the direction perpendicular to the edge orientation. To address this concern, the orientation map discussed in Section 2.1 is used. Only in the special case when the orientation class for pixel $(i,j)$ is $N$, i.e. no distinct direction, does the process use a non-directional weighted average, whose weights are formed from the tensor product of the seven tap horizontal kernel,



and the five tap vertical kernel, 



in one particular embodiment of the invention. In this case, if the weighted averages formed around $y_r[i,j]$ and $y_a[i,j]$ differ by more than a prescribed threshold, the merge mask, $m_{i,j}$, is set to 0, indicating that the motion model should not be considered valid in this region.

[0058] For all other orientation classes, the reference and auxiliary fields are first filtered with the three tap horizontal low-pass kernel,

$(1 \;\; 1 \;\; 1)$

and then four one-dimensional weighted averages, $\rho[i,j]$, $\alpha_1[i,j]$, $\alpha_2[i,j]$ and $\alpha_3[i,j]$, are computed. Each of these weighted averages is taken along a line oriented in the direction identified by the orientation class for pixel $(i,j)$, using the weights.



[0059] The oriented line used to form $\rho[i,j]$ is centered about pixel $(i,j)$ in the reference field. Similarly, the line used to form $\alpha_2[i,j]$ is centered about pixel $(i,j)$ in the auxiliary field. The lines for $\alpha_1[i,j]$ and $\alpha_3[i,j]$ have centers which fall on either side of pixel $(i,j)$ in the auxiliary field, displaced by approximately half a field row or one field column, as appropriate, in the orthogonal direction to that identified by the orientation class. Thus, in a region whose orientation class is uniformly vertical, the result would be $\alpha_1[i,j] = \alpha_2[i,j-1] = \alpha_3[i,j-2]$. On the other hand, in a region whose orientation class is uniformly oblique, $O^-$ or $O^+$, the horizontal average, $\alpha_1[i,j]$, should approximately equal the arithmetic mean of $\alpha_2[i,j]$ and $\alpha_2[i,j-1]$. From these directional averages, three absolute differences are formed,

$$\delta_k[i,j] = |\rho[i,j] - \alpha_k[i,j]|, \quad k = 1, 2, 3.$$

[0060] If $\delta_2[i,j]$ exceeds a pre-determined threshold, it is concluded that the motion vector does not describe scene motion in the vicinity of pixel $(i,j)$ and $m_{i,j}$ is set to 0 accordingly. Otherwise, it is concluded that the motion vector is approximately accurate, but the process must still check to see if it is sufficiently accurate for field merging to improve the quality of an edge feature. Reasoning that a small motion error would generally cause one, but not both, of $\delta_1[i,j]$ and $\delta_3[i,j]$ to be smaller than $\delta_2[i,j]$, the process tests for this condition, setting $m_{i,j}$ to 0 whenever it is found to be true.
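
The two-stage test for directed orientation classes can be sketched in Python as follows. The sketch assumes that the centered auxiliary average is the second of the three, that the directional averages have already been computed, and that the threshold is supplied by the caller; the function name and the numeric values are hypothetical.

```python
def merge_mask_value(rho, alpha, threshold):
    """Return 1 if the motion vector is accepted at a pixel, else 0.

    rho: directional average around the pixel in the reference field.
    alpha: (alpha1, alpha2, alpha3) directional averages in the auxiliary
           field, where alpha2 is assumed to be the centered one.
    """
    d1, d2, d3 = (abs(rho - a) for a in alpha)
    if d2 > threshold:
        return 0                      # motion vector does not describe the scene here
    if (d1 < d2) != (d3 < d2):
        return 0                      # exactly one displaced average fits better
    return 1


print(merge_mask_value(100, (90, 100, 110), threshold=5))
print(merge_mask_value(100, (100, 104, 120), threshold=5))
```

In the second call the centered difference is within the threshold, but one displaced average matches the reference better than the centered one, which is the signature of a small motion error, so the pixel is masked off.
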

3.2 Generation of Spatial Weighting Factors 

[0061] The merging sub-system forms a weighted average between the directionally interpolated pixels from each field. This section describes the methodology used to determine relevant weights. The chrominance and luminance components are treated in a fundamentally different way, since most of the spatial information is carried only in the luminance channel. All chrominance samples in the reference field are assigned a weight of 1, while chrominance samples from the auxiliary fields are assigned weights of either 1 or 0, depending only on the value of the merge mask for the relevant auxiliary field. In this way, the chrominance components are simply averaged across all fields, except in regions where the motion vectors do not reflect the underlying scene motion. This has the effect of substantially reducing chrominance noise. Moreover, the fact that most scenes contain at least some inter-field motion means that field averaging of the chrominance components tends to cancel color aliasing artifacts, which arise from the harmonic beating of scene features with the color mosaic used in single CCD video cameras.

[0062] The same approach could be adopted for merging the luminance components as well, but limitations might exist with respect to enhancing spatial frequency content. Although the directional spatial interpolation technique is able to enhance spatial frequency content of oriented edge features, textured regions are entirely dependent upon the information from multiple fields for resolution enhancement. In the limit as the number of available fields becomes very large, simple averaging of the interpolated fields has the effect of subjecting the original spatial frequencies in the scene to a low pass filter whose impulse response is identical to the spatial interpolation kernel. If an "ideal" sinc interpolator is used to vertically interpolate the missing rows in each field, the result is an image which has no vertical frequency content whatsoever beyond the Nyquist limit associated with a single field. In the example embodiment of the invention, linear interpolation is used to interpolate the missing field rows prior to merging; and the averaging process does tend to eliminate aliasing artifacts. In order to preserve high spatial frequencies while still removing aliasing and reducing noise in the luminance components, a space varying weighting function can be adopted. Specifically, each luminance sample in any given auxiliary field is assigned a weight of 2 if it corresponds to an original pixel from that field, 1 if it is located within one interpolated row (i.e. one quarter of a field row) from an original pixel, and 0 otherwise. If the relevant merge mask is zero, then the process sets the weight to 0 regardless of the distance between the sample and an original field sample. The reference field luminance samples are weighted in the same manner, except that all samples are assigned a weight of at least 1, in order to provide that at least one non-zero weight is available for every sample in the merged image. This weighting policy has the effect of subjecting vertical frequencies to a much less severe lowpass filter than simple averaging with uniform weights.
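
The weighting policy can be written out as a short Python sketch. It assumes that rows are indexed on the interpolated grid of the field in question, with that field's original rows at multiples of 4; the function names are hypothetical.

```python
def luma_weight(row, mask, is_reference):
    """Weight of an interpolated luminance sample.

    row: row index on the interpolated grid, where this field's original rows
         are assumed to sit at multiples of 4.
    mask: merge mask value (0 or 1); ignored for the reference field.
    """
    offset = row % 4
    if offset == 0:
        weight = 2                    # original field sample
    elif offset in (1, 3):
        weight = 1                    # within one interpolated row of an original
    else:
        weight = 0                    # midway between two original rows
    if is_reference:
        return max(weight, 1)         # reference samples never drop to zero
    return weight if mask else 0


def chroma_weight(mask, is_reference):
    """Chrominance samples are either averaged in (1) or masked off (0)."""
    return 1 if (is_reference or mask) else 0


print([luma_weight(r, 1, False) for r in range(4)])
print([luma_weight(r, 1, True) for r in range(4)])
```

The auxiliary weights fall to 0 midway between original rows, while the reference weights stay at 1 there, so every merged sample has at least one contributor.
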

3.3 Fast Method for Implementing the Weighted Averages 

[0063] This section describes an efficient method used to implement the weighted averaging of interpolated sample values from the reference and auxiliary fields. The same technique can be used for both luminance and chrominance samples. To facilitate the discussion, let $u_1, u_2, \ldots, u_F$ denote the sample values to be merged from each of $F$ fields to form a single luminance or chrominance sample in the final image. Also, let $\omega_1, \omega_2, \ldots, \omega_F$ denote the corresponding weighting values, which take on values of 0, 1 or 2 according to the discussion in Section 3.2. The desired weighted average may be calculated as

$$u = \frac{\sum_{f=1}^{F} \omega_f\, u_f}{\sum_{f=1}^{F} \omega_f}.$$

[0064] This expression involves a costly division operation. To resolve this difficulty, in one embodiment of the invention, the process constructs a single 16 bit word, $v_f$, for each sample. The least significant 9 bits of $v_f$ hold the weighted sample value, $u_f \cdot \omega_f$; the next 3 bits are set to zero; and the most significant 4 bits of $v_f$ hold the weight, $\omega_f$. The weighted average is then implemented by forming the sum,

$$v = \sum_{f=1}^{F} v_f,$$

and using $v$ as the index to a lookup table with $2^{16}$ entries. This technique will be effective so long as the number of fields, $F$, does not exceed 8. Subject to this condition, the least significant 12 bits of $v$ hold the sum of the weighted sample values and the most significant 4 bits hold the sum of the weights, so that a table lookup operation is sufficient to recover the weighted average.
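
The packing and lookup can be sketched in Python as follows. The sketch assumes 8-bit samples, rounds the stored quotient to the nearest integer, and maps a zero weight sum to 0; these choices and the function names are assumptions made for illustration.

```python
def build_lookup_table():
    """Table mapping every 16-bit sum v to the rounded weighted average."""
    table = [0] * (1 << 16)
    for v in range(1 << 16):
        weight_sum = v >> 12          # most significant 4 bits
        value_sum = v & 0x0FFF        # least significant 12 bits
        if weight_sum:
            table[v] = (value_sum + weight_sum // 2) // weight_sum
    return table


def pack(sample, weight):
    """Pack an 8-bit sample and its weight (0, 1 or 2) into one 16-bit word."""
    return (weight << 12) | (sample * weight)


def merge(samples, weights, table):
    """Weighted average of samples using one addition per field and one lookup."""
    v = sum(pack(u, w) for u, w in zip(samples, weights))
    assert v < (1 << 16), "weight sum must stay below 16"
    return table[v]


table = build_lookup_table()
print(merge([100, 200, 50], [2, 1, 0], table))
```

The table is built once; each merged sample then costs only additions and a single lookup. The guard in `merge` rejects inputs whose weights sum to 16 or more, since that sum would not fit in the 4-bit weight field.
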

4 Performance 

[0065] Although the multi-field enhancement system disclosed in this document may appear to involve numerous operations, an efficient implementation need not make exorbitant demands on the computing or memory resources of a general purpose computer. This is because the numerous intermediate results required to implement the various sub-systems described earlier may be generated and discarded incrementally on a row-by-row basis. Moreover, intermediate results may often be shared among the different sub-systems. Many parameters, such as filter coefficients and dimensions, have been selected with a view to implementation efficiency. As an example, to process four full color video fields, each with 240 rows and 640 columns, the system requires a total storage capacity of only 1.1 MB, almost all of which (0.92 MB) is used to store the source fields themselves. The processing of these four fields requires about 8 seconds of CPU time on, for instance, an HP Series 735 workstation operating at 99 MHz. On a PC with a 200 MHz Pentium Pro processor, the same operation takes less than 4 seconds of CPU time. Empirical observations indicate that this multi-field processing system achieves significantly higher still image quality than conventional single field enhancement or de-interlacing techniques. Moreover, the system appears to be robust to a wide range of inter-field motion, from simple camera jitter to more complex motion of scene objects.
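
The source-field storage figure can be checked with a few lines of Python arithmetic. The byte-per-sample assumption and the chrominance layout (two components, each at one quarter of the horizontal luminance resolution, as stated in Section 2.2) are assumptions of this sketch rather than figures given alongside the storage estimate.

```python
ROWS, COLS, FIELDS = 240, 640, 4

# Assumptions: one byte per sample, two chrominance components, each with
# one quarter of the horizontal luminance resolution.
luma_bytes = ROWS * COLS
chroma_bytes = 2 * ROWS * (COLS // 4)
source_bytes = FIELDS * (luma_bytes + chroma_bytes)

print(source_bytes)                   # 921600
print(round(source_bytes / 1e6, 2))   # 0.92
```
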






Claims 

1. A method for combining the information from multiple video fields having pixels into an enhanced still image, the method comprising the steps of:

(a) selecting at least one field to serve as a reference field;

(b) using remaining fields to serve as auxiliary fields;

(c) constructing an orientation map for the reference field which is used to directionally interpolate the reference field;

(d) constructing a motion map to model the local displacement between features in the reference field and corresponding features in the auxiliary fields;

(e) using the motion map to infer an orientation map for each of the auxiliary fields directly from the orientation map of the reference field;

(f) using the orientation maps to directionally interpolate the auxiliary fields to the same resolution as the interpolated reference field;

(g) determining a merge mask for each auxiliary field to mask off certain pixels;

(h) forming a weighted average image from the reference field pixels, and the auxiliary field pixels which have not been masked off; and

(i) horizontally interpolating the weighted average image to form the enhanced still image.

2. The method of Claim 1, wherein the steps are applied to the luminance components of the video fields.

3. The method of Claim 1, wherein the steps are applied to the chrominance components of the video fields.

4. The method of Claim 3, wherein the steps are performed on chrominance components with relatively less spatial frequency content than corresponding luminance components.

5. The method of Claim 1, wherein the motion maps in step (d) are computed, using the directionally interpolated reference field, to sub-pixel quarter pixel accuracy in the vertical direction and half pixel accuracy in the horizontal direction.

6. The method of Claim 1, wherein step (e) is replaced with: computing an orientation map for each field separately.

7. The method of Claim 1, wherein the certain pixels in step (g) correspond to regions where the motion maps fail to correctly model the relationship between the reference and auxiliary fields.

8. The method of Claim 1, including an additional post-processing operation which sharpens the image formed from the weighted averaging step (h).

9. The method of Claim 1, wherein the horizontal interpolation of step (i) includes a horizontal interpolation factor of



10. A system for combining the information from multiple video fields having pixels into an enhanced still image, the system comprising:

(a) a selection means for selecting at least one field to serve as a reference field, and using the remaining fields to serve as auxiliary fields;

(b) a mapping means for

(i) constructing an orientation map for the reference field which is used to directionally interpolate the reference field;

(ii) constructing a motion map to model the local displacement between features in the reference field and corresponding features in the auxiliary fields;

(iii) using the motion map to infer an orientation map for each of the auxiliary fields directly from the orientation map of the reference field;

(c) an interpolation means which uses the orientation maps to directionally interpolate the auxiliary fields to the same resolution as the interpolated reference field;

(d) a masking means which determines a merge mask for each auxiliary field to mask off certain pixels;

(e) an averaging means which forms a weighted average image from the reference field pixels, and the auxiliary field pixels which have not been masked off; and

(f) an interpolation means which horizontally interpolates the weighted average image to form the enhanced still image.

11. A system for efficiently estimating motion between two video fields, the system comprising:

(a) a means for bandpass filtering each of the fields;

(b) a means for obtaining a two-bit representation of each sample in the bandpass filtered fields;

(c) a means for comparing the two-bit representations which are associated with relatively displaced regions from the two fields, in order to determine an initial coarse motion estimate with full-pixel accuracy; and

(d) a means for refining the initial coarse motion estimate to fractional pixel accuracy.

12. The system of Claim 11, wherein the fields include frames.

13. The system of Claim 11, wherein the bandpass filtering step (a) is accomplished by taking the difference between two moving averages.

14. The system of Claim 13, wherein the moving averages have different window sizes.

15. The system of Claim 11, wherein the two-bit representation step (b) includes three states, according to whether the bandpass filter sample exceeds a positive threshold, falls below a negative threshold, or falls between the positive and negative thresholds.

16. The system of Claim 11, wherein the two-bit representations for two different bandpass filtered samples are two-bit words which are compared in step (c) by applying a logical exclusive OR operator to the pair of two-bit words and counting the number of ones in the resulting two-bit value.
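
The coarse estimation recited in Claims 11 and 13 to 16 can be illustrated with a one-dimensional Python sketch. It is a simplification under stated assumptions: the signals are 1-D rather than 2-D fields, the window sizes, threshold and search range are hypothetical, and the code assignment (0b00, 0b01, 0b11) is chosen here so that the count of ones in the XOR grows with the distance between states. The fractional-pixel refinement of step (d) is not shown.

```python
def moving_average(x, window):
    """Centered moving average (odd window) with edge samples repeated."""
    half = window // 2
    padded = [x[0]] * half + list(x) + [x[-1]] * half
    return [sum(padded[i:i + window]) / window for i in range(len(x))]


def two_bit_codes(x, small=3, large=9, threshold=1.0):
    """Bandpass (difference of two moving averages), then quantize to 3 states."""
    band = [a - b for a, b in zip(moving_average(x, small), moving_average(x, large))]
    # Hypothetical codes: 0b00 below -threshold, 0b01 in between, 0b11 above.
    return [0b11 if v > threshold else 0b00 if v < -threshold else 0b01 for v in band]


def mismatch(a, b):
    """Number of ones in the XOR of paired two-bit words, summed over pairs."""
    return sum(bin(p ^ q).count("1") for p, q in zip(a, b))


def coarse_shift(ref, aux, max_shift=4):
    """Full-pixel shift s minimizing the mismatch between ref[n] and aux[n + s]."""
    cr, ca = two_bit_codes(ref), two_bit_codes(aux)
    best = None
    for s in range(-max_shift, max_shift + 1):
        lo, hi = max(0, -s), min(len(cr), len(ca) - s)
        cost = mismatch(cr[lo:hi], ca[lo + s:hi + s]) / (hi - lo)
        if best is None or cost < best[0]:
            best = (cost, s)
    return best[1]


ref = [0] * 10 + [50] * 5 + [0] * 15
aux = [0] * 12 + [50] * 5 + [0] * 13      # same feature, two samples later
print(coarse_shift(ref, aux))
```

Because each sample is reduced to two bits, the comparison needs only XOR and bit counting, which is what makes the coarse search inexpensive.
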






EP 0 944 251 A1 







[Figure 5: pixels and weights (1 and 2) used to form the linear combinations, laid out over the target and next field rows and the target and next field columns]







                    TARGET COLUMN 92
                           |
       n-2    n-1     n     n+1    n+2

        O      O      O      O      O     - ROW m-1 (100)

        O      O      •      O      O     - TARGET ROW m

        O      O      O      O      O     - ROW m+1 (102)
       '-----96-----'       '-----98-----'

Figure 9




Figure 10 







European Patent Office

EUROPEAN SEARCH REPORT

Application Number: EP 98 11 6451

DOCUMENTS CONSIDERED TO BE RELEVANT

Citation of document with indication, where appropriate, of relevant passages (relevant to claim)

US 5 341 174 A (WALOWIT ERIC ET AL), 23 August 1994 (1994-08-23); the whole document (claims 1, 10)

EP 0 785 683 A (SHARP KK), 23 July 1997 (1997-07-23); abstract (claims 1, 10)

NATARAJAN B ET AL: "LOW-COMPLEXITY BLOCK-BASED MOTION ESTIMATION VIA ONE-BIT TRANSFORMS", IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, vol. 7, no. 4, 1 August 1997 (1997-08-01), pages 702-706, XP000694623, ISSN: 1051-8215; paragraphs 0002, 0003 (claims 11, 12, 16)

LEE S ET AL: "TWO-STEP MOTION ESTIMATION ALGORITHM USING LOW-RESOLUTION QUANTIZATION", PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (IC, LAUSANNE, SEPT. 16 - 19, 1996, VOL. 3, PAGE(S) 795 - 798, INSTITUTE OF ELECTRICAL AND ELECTRONICS ENGINEERS, XP000704130, ISBN: 0-7803-3259-8; paragraph 0002 (claims 11, 12, 16)

EP 0 454 442 A (CANON KK), 30 October 1991 (1991-10-30); column 3, line 6 - column 4, line 39 (claim 11)

WO 95 25404 A (HENOT JEAN PIERRE; TELEDIFFUSION FSE (FR); FRANCE TELECOM (FR)), 21 September 1995 (1995-09-21); abstract (claim 11)

OGURA E ET AL: "A COST EFFECTIVE MOTION ESTIMATION PROCESSOR LSI USING A SIMPLE AND EFFICIENT ALGORITHM", IEEE TRANSACTIONS ON CONSUMER ELECTRONICS, vol. 41, no. 3, 1 August 1995 (1995-08-01), pages 690-696, XP000539525, ISSN: 0098-3063

SIU-LEONG IU: "COMPARISION OF MOTION COMPENSATION USING DIFFERENT DEGREES OF SUB-PIXEL ACCURACY FOR INTERFIELD/INTERFRAME HYBRID CODING OF HDTV IMAGE SEQUENCES", MULTIDIMENSIONAL SIGNAL PROCESSING, SAN FRANCISCO, MAR. 23 - 26, 1992, VOL. 3, NR. CONF. 17, PAGE(S) 465 - 468, INSTITUTE OF ELECTRICAL AND ELECTRONICS ENGINEERS, XP000378969, ISBN: 0-7803-0532-9

CLASSIFICATION OF THE APPLICATION (Int.Cl.6): H04N5/44, H04N1/00

TECHNICAL FIELDS SEARCHED (Int.Cl.6): H04N

The present search report has been drawn up for all claims.

Place of search: THE HAGUE
Date of completion of the search: 30 July 1999
Examiner: Yvonnet, J

CATEGORY OF CITED DOCUMENTS

X : particularly relevant if taken alone
Y : particularly relevant if combined with another document of the same category
A : technological background
O : non-written disclosure
P : intermediate document
T : theory or principle underlying the invention
E : earlier patent document, but published on, or after the filing date
D : document cited in the application
L : document cited for other reasons
& : member of the same patent family, corresponding document

CLAIMS INCURRING FEES

The present European patent application comprised at the time of filing more than ten claims.

LACK OF UNITY OF INVENTION

The Search Division considers that the present European patent application does not comply with the requirements of unity of invention and relates to several inventions or groups of inventions, namely:

1. Claims: 1-10

method and system for combining the information from multiple fields into a still image

2. Claims: 11-16

system for estimating motion between two fields

All further search fees have been paid within the fixed time limit. The present European search report has been drawn up for all claims.



ANNEX TO THE EUROPEAN SEARCH REPORT ON EUROPEAN PATENT APPLICATION NO. EP 98 11 6451

This annex lists the patent family members relating to the patent documents cited in the above-mentioned European search report. The members are as contained in the European Patent Office EDP file on 30-07-1999. The European Patent Office is in no way liable for these particulars which are merely given for the purpose of information.

Patent document cited    Publication    Patent family      Publication
in search report         date           member(s)          date

US 5341174 A             23-08-1994     NONE

EP 0785683 A             23-07-1997     JP 9200575 A       31-07-1997
                                        JP 9275541 A       21-10-1997
                                        CN 1168052 A       17-12-1997
                                        US 5832143 A       03-11-1998

EP 0454442 A             30-10-1991     JP 2811909 B       15-10-1998
                                        JP 4010176 A       14-01-1992
                                        DE 69128211 D      02-01-1998
                                        DE 69128211 T      02-04-1998
                                        US 5189513 A       23-02-1993

WO 9525404 A             21-09-1995     FR 2717648 A       22-09-1995

For more details about this annex: see Official Journal of the European Patent Office, No. 12/82



